Dataset statistics
| Number of variables | 36 |
|---|---|
| Number of observations | 839 |
| Missing cells | 505 |
| Missing cells (%) | 1.7% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 230.5 KiB |
| Average record size in memory | 281.3 B |
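The percentage and per-record figures in this table follow directly from the raw counts. A minimal sketch of that arithmetic, using only the numbers reported above; the function names are illustrative and are not part of pandas-profiling:

```python
def missing_cell_share(missing_cells: int, n_rows: int, n_cols: int) -> float:
    """Missing cells as a percentage of all cells (rows * columns)."""
    return 100.0 * missing_cells / (n_rows * n_cols)


def average_record_size(total_kib: float, n_rows: int) -> float:
    """Average bytes per row, given the total in-memory size in KiB."""
    return total_kib * 1024 / n_rows


missing_pct = missing_cell_share(505, n_rows=839, n_cols=36)
record_bytes = average_record_size(230.5, n_rows=839)
print(f"{missing_pct:.1f}%")    # 1.7%
print(f"{record_bytes:.1f} B")  # 281.3 B
```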
Variable types
| Numeric | 17 |
|---|---|
| Categorical | 18 |
| DateTime | 1 |
Alerts
Has_Comments has constant value "1" | Constant |
headline has a high cardinality: 838 distinct values | High cardinality |
abstract has a high cardinality: 837 distinct values | High cardinality |
keywords has a high cardinality: 781 distinct values | High cardinality |
uniqueID has a high cardinality: 839 distinct values | High cardinality |
lead_paragraph has a high cardinality: 785 distinct values | High cardinality |
headline.main has a high cardinality: 838 distinct values | High cardinality |
df_index is highly correlated with month | High correlation |
month is highly correlated with df_index | High correlation |
TEXT_LeadParagraph_POS_PNOUN is highly correlated with TEXT_LeadParagraph_ENT_ORG and 1 other fields | High correlation |
TEXT_Keywords_POS_PNOUN is highly correlated with TEXT_Keywords_ENT_ORG and 1 other fields | High correlation |
TEXT_LeadParagraph_ENT_ORG is highly correlated with TEXT_LeadParagraph_POS_PNOUN | High correlation |
TEXT_Keywords_ENT_ORG is highly correlated with TEXT_Keywords_POS_PNOUN | High correlation |
TEXT_LeadParagraph_ENT_PERSON is highly correlated with TEXT_LeadParagraph_POS_PNOUN | High correlation |
TEXT_Keyrwords_ENT_PERSON is highly correlated with TEXT_Keywords_POS_PNOUN | High correlation |
df_index is highly correlated with month | High correlation |
month is highly correlated with df_index | High correlation |
TEXT_LeadParagraph_POS_PNOUN is highly correlated with TEXT_LeadParagraph_ENT_ORG and 2 other fields | High correlation |
TEXT_Keywords_POS_PNOUN is highly correlated with TEXT_Keywords_ENT_ORG and 1 other fields | High correlation |
TEXT_LeadParagraph_ENT_ORG is highly correlated with TEXT_LeadParagraph_POS_PNOUN | High correlation |
TEXT_Keywords_ENT_ORG is highly correlated with TEXT_Keywords_POS_PNOUN | High correlation |
TEXT_LeadParagraph_ENT_GPE is highly correlated with TEXT_LeadParagraph_POS_PNOUN | High correlation |
TEXT_LeadParagraph_ENT_PERSON is highly correlated with TEXT_LeadParagraph_POS_PNOUN | High correlation |
TEXT_Keyrwords_ENT_PERSON is highly correlated with TEXT_Keywords_POS_PNOUN | High correlation |
df_index is highly correlated with month | High correlation |
month is highly correlated with df_index | High correlation |
TEXT_LeadParagraph_POS_PNOUN is highly correlated with TEXT_LeadParagraph_ENT_PERSON | High correlation |
TEXT_Keywords_POS_PNOUN is highly correlated with TEXT_Keywords_ENT_ORG and 1 other fields | High correlation |
TEXT_Keywords_ENT_ORG is highly correlated with TEXT_Keywords_POS_PNOUN | High correlation |
TEXT_LeadParagraph_ENT_PERSON is highly correlated with TEXT_LeadParagraph_POS_PNOUN | High correlation |
TEXT_Keyrwords_ENT_PERSON is highly correlated with TEXT_Keywords_POS_PNOUN | High correlation |
TEXT_Keywords_ENT_LOC is highly correlated with Has_Comments | High correlation |
newsdesk is highly correlated with section and 3 other fields | High correlation |
TEXT_LeadParagraph_ENT_LOC is highly correlated with Has_Comments | High correlation |
TEXT_LeadParagraph_ENT_FAC is highly correlated with Has_Comments | High correlation |
section is highly correlated with newsdesk and 3 other fields | High correlation |
TEXT_Keywords_ENT_NORP is highly correlated with Has_Comments | High correlation |
comment_size is highly correlated with Has_Comments | High correlation |
TEXT_headline.main_POS_NOUN is highly correlated with Has_Comments | High correlation |
TEXT_Keywords_ENT_FAC is highly correlated with Has_Comments | High correlation |
Has_Comments is highly correlated with TEXT_Keywords_ENT_LOC and 10 other fields | High correlation |
material is highly correlated with newsdesk and 2 other fields | High correlation |
print_section is highly correlated with newsdesk and 2 other fields | High correlation |
df_index is highly correlated with month | High correlation |
newsdesk is highly correlated with section and 9 other fields | High correlation |
section is highly correlated with newsdesk and 8 other fields | High correlation |
material is highly correlated with newsdesk and 3 other fields | High correlation |
word_count is highly correlated with TEXT_Keywords_POS_PNOUN and 2 other fields | High correlation |
n_comments is highly correlated with comment_size | High correlation |
print_section is highly correlated with newsdesk and 3 other fields | High correlation |
print_page is highly correlated with newsdesk and 2 other fields | High correlation |
month is highly correlated with df_index | High correlation |
comment_size is highly correlated with newsdesk and 3 other fields | High correlation |
TEXT_LeadParagraph_POS_NOUN is highly correlated with newsdesk and 3 other fields | High correlation |
TEXT_LeadParagraph_POS_PNOUN is highly correlated with newsdesk and 6 other fields | High correlation |
TEXT_Keywords_POS_NOUN is highly correlated with section and 2 other fields | High correlation |
TEXT_Keywords_POS_PNOUN is highly correlated with newsdesk and 5 other fields | High correlation |
TEXT_LeadParagraph_ENT_ORG is highly correlated with TEXT_LeadParagraph_POS_PNOUN | High correlation |
TEXT_Keywords_ENT_ORG is highly correlated with word_count and 3 other fields | High correlation |
TEXT_LeadParagraph_ENT_NORP is highly correlated with TEXT_LeadParagraph_POS_NOUN and 1 other fields | High correlation |
TEXT_Keywords_ENT_FAC is highly correlated with TEXT_Keywords_ENT_GPE | High correlation |
TEXT_LeadParagraph_ENT_GPE is highly correlated with TEXT_LeadParagraph_POS_PNOUN | High correlation |
TEXT_Keywords_ENT_GPE is highly correlated with newsdesk and 3 other fields | High correlation |
TEXT_Keywords_ENT_LOC is highly correlated with newsdesk | High correlation |
TEXT_LeadParagraph_ENT_PERSON is highly correlated with TEXT_LeadParagraph_POS_PNOUN | High correlation |
TEXT_Keyrwords_ENT_PERSON is highly correlated with word_count and 2 other fields | High correlation |
print_section has 252 (30.0%) missing values | Missing |
print_page has 252 (30.0%) missing values | Missing |
headline is uniformly distributed | Uniform |
abstract is uniformly distributed | Uniform |
uniqueID is uniformly distributed | Uniform |
lead_paragraph is uniformly distributed | Uniform |
headline.main is uniformly distributed | Uniform |
df_index has unique values | Unique |
uniqueID has unique values | Unique |
word_count has 19 (2.3%) zeros | Zeros |
TEXT_LeadParagraph_POS_NOUN has 27 (3.2%) zeros | Zeros |
TEXT_LeadParagraph_POS_PNOUN has 130 (15.5%) zeros | Zeros |
TEXT_Keywords_POS_NOUN has 116 (13.8%) zeros | Zeros |
TEXT_Keywords_POS_PNOUN has 37 (4.4%) zeros | Zeros |
TEXT_headline.main_POS_PNOUN has 56 (6.7%) zeros | Zeros |
TEXT_LeadParagraph_ENT_ORG has 555 (66.2%) zeros | Zeros |
TEXT_Keywords_ENT_ORG has 262 (31.2%) zeros | Zeros |
TEXT_LeadParagraph_ENT_NORP has 720 (85.8%) zeros | Zeros |
TEXT_LeadParagraph_ENT_GPE has 450 (53.6%) zeros | Zeros |
TEXT_Keywords_ENT_GPE has 406 (48.4%) zeros | Zeros |
TEXT_LeadParagraph_ENT_PERSON has 461 (54.9%) zeros | Zeros |
TEXT_Keyrwords_ENT_PERSON has 262 (31.2%) zeros | Zeros |
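The count-based alerts above (missing, constant, unique, zeros) can be reproduced with simple per-column checks. The sketch below is a simplification under stated assumptions: `share` and `simple_alerts` are hypothetical helpers, not pandas-profiling's implementation, and the correlation and cardinality alerts, which depend on configurable thresholds, are not covered.

```python
def share(count: int, total: int) -> str:
    """Format count/total the way the alerts do, e.g. '19 (2.3%)'."""
    return f"{count} ({100.0 * count / total:.1f}%)"


def simple_alerts(name, values):
    """Return illustrative alert strings for one column (None marks a missing value)."""
    total = len(values)
    present = [v for v in values if v is not None]
    alerts = []
    n_missing = total - len(present)
    if n_missing:
        alerts.append(f"{name} has {share(n_missing, total)} missing values")
    distinct = set(present)
    if len(distinct) == 1:
        alerts.append(f'{name} has constant value "{present[0]}"')
    elif present and len(distinct) == total:
        alerts.append(f"{name} has unique values")
    n_zeros = sum(1 for v in present if v == 0)
    if n_zeros:
        alerts.append(f"{name} has {share(n_zeros, total)} zeros")
    return alerts


print(share(19, 839))   # 19 (2.3%)
print(share(252, 839))  # 252 (30.0%)
print(simple_alerts("Has_Comments", [1, 1, 1, 1]))
```

The percentages in the alerts use the full row count (839) as the denominator, which is why 252 missing values are reported as 30.0%.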
Reproduction
| Analysis started | 2022-01-11 22:26:53.386349 |
|---|---|
| Analysis finished | 2022-01-11 22:28:25.120151 |
| Duration | 1 minute and 31.73 seconds |
| Software version | pandas-profiling v3.1.0 |
| Download configuration | config.json |
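The reported duration can be checked from the two timestamps. A short sketch using only the standard library; the timestamps are copied from the table above:

```python
from datetime import datetime

started = datetime.fromisoformat("2022-01-11 22:26:53.386349")
finished = datetime.fromisoformat("2022-01-11 22:28:25.120151")

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minute and {seconds:.2f} seconds")  # 1 minute and 31.73 seconds
```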
df_index
Real number (ℝ≥0)
HIGH CORRELATION, UNIQUE
| Distinct | 839 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8483.678188 |
| Minimum | 7 |
| Maximum | 16771 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 7 |
|---|---|
| 5-th percentile | 862.7 |
| Q1 | 4367.5 |
| median | 8275 |
| Q3 | 12657 |
| 95-th percentile | 15936 |
| Maximum | 16771 |
| Range | 16764 |
| Interquartile range (IQR) | 8289.5 |
Descriptive statistics
| Standard deviation | 4818.381176 |
|---|---|
| Coefficient of variation (CV) | 0.567958976 |
| Kurtosis | -1.205589822 |
| Mean | 8483.678188 |
| Median Absolute Deviation (MAD) | 4139 |
| Skewness | -0.02563644036 |
| Sum | 7117806 |
| Variance | 23216797.16 |
| Monotonicity | Not monotonic |
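Several of the descriptive statistics are derived from one another: the variance is the squared standard deviation, the coefficient of variation is the standard deviation divided by the mean, the range is maximum minus minimum, and the IQR is Q3 minus Q1. The sketch below shows these relations on a toy list and then checks them against the df_index figures reported above. `describe` is an illustrative helper, and the sample (n - 1) standard deviation is an assumption that matches the pandas default.

```python
import statistics


def describe(values):
    """Summary statistics for a list of numbers, using the sample standard deviation."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1)
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    return {
        "mean": mean,
        "std": std,
        "variance": std ** 2,
        "cv": std / mean,  # coefficient of variation
        "mad": mad,        # median absolute deviation
        "range": max(values) - min(values),
    }


toy = [2, 4, 4, 4, 5, 5, 7, 9]
stats = describe(toy)
print(stats["mean"], stats["mad"], stats["range"])

# Consistency check against the figures reported above for df_index.
reported_std, reported_mean = 4818.381176, 8483.678188
print(round(reported_std / reported_mean, 6))  # 0.567959
print(16771 - 7)                               # 16764 (range)
print(12657 - 4367.5)                          # 8289.5 (IQR = Q3 - Q1)
```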
| Value | Count | Frequency (%) |
| 13715 | 1 | 0.1% |
| 10178 | 1 | 0.1% |
| 11257 | 1 | 0.1% |
| 424 | 1 | 0.1% |
| 27 | 1 | 0.1% |
| 13900 | 1 | 0.1% |
| 8274 | 1 | 0.1% |
| 13698 | 1 | 0.1% |
| 5656 | 1 | 0.1% |
| 14638 | 1 | 0.1% |
| Other values (829) | 829 |
| Value | Count | Frequency (%) |
| 7 | 1 | |
| 27 | 1 | |
| 32 | 1 | |
| 48 | 1 | |
| 55 | 1 | |
| 110 | 1 | |
| 118 | 1 | |
| 122 | 1 | |
| 180 | 1 | |
| 186 | 1 |
| Value | Count | Frequency (%) |
| 16771 | 1 | |
| 16749 | 1 | |
| 16747 | 1 | |
| 16730 | 1 | |
| 16695 | 1 | |
| 16646 | 1 | |
| 16645 | 1 | |
| 16640 | 1 | |
| 16624 | 1 | |
| 16612 | 1 |
newsdesk
Categorical
| Distinct | 42 |
|---|---|
| Distinct (%) | 5.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
| OpEd | 98 |
|---|---|
| Business | 64 |
| Foreign | 54 |
| Culture | 48 |
| Metro | 47 |
| Other values (37) | 528 |
Length
| Max length | 15 |
|---|---|
| Median length | 7 |
| Mean length | 7.059594756 |
| Min length | 4 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 4 |
|---|---|
| Unique (%) | 0.5% |
Sample
| 1st row | Sports |
|---|---|
| 2nd row | Well |
| 3rd row | Upshot |
| 4th row | Foreign |
| 5th row | OpEd |
Common Values
| Value | Count | Frequency (%) |
| OpEd | 98 | 11.7% |
| Business | 64 | 7.6% |
| Foreign | 54 | 6.4% |
| Culture | 48 | 5.7% |
| Metro | 47 | 5.6% |
| National | 41 | 4.9% |
| Washington | 39 | 4.6% |
| Learning | 32 | 3.8% |
| Dining | 31 | 3.7% |
| RealEstate | 31 | 3.7% |
| Other values (32) | 354 |
Words
| Value | Count | Frequency (%) |
| oped | 98 | 11.6% |
| business | 64 | 7.6% |
| foreign | 54 | 6.4% |
| culture | 48 | 5.7% |
| metro | 47 | 5.6% |
| national | 41 | 4.9% |
| washington | 39 | 4.6% |
| learning | 32 | 3.8% |
| dining | 31 | 3.7% |
| realestate | 31 | 3.7% |
| Other values (33) | 359 |
section
Categorical
| Distinct | 31 |
|---|---|
| Distinct (%) | 3.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
| U.S. | 116 |
|---|---|
| Opinion | 115 |
| New York | 63 |
| World | 60 |
| Business Day | 49 |
| Other values (26) | 436 |
Length
| Max length | 20 |
|---|---|
| Median length | 7 |
| Mean length | 7.483909416 |
| Min length | 4 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | Sports |
|---|---|
| 2nd row | Well |
| 3rd row | The Upshot |
| 4th row | World |
| 5th row | Opinion |
Common Values
| Value | Count | Frequency (%) |
| U.S. | 116 | 13.8% |
| Opinion | 115 | 13.7% |
| New York | 63 | 7.5% |
| World | 60 | 7.2% |
| Business Day | 49 | 5.8% |
| Arts | 48 | 5.7% |
| The Learning Network | 34 | 4.1% |
| Food | 32 | 3.8% |
| Real Estate | 31 | 3.7% |
| Well | 30 | 3.6% |
| Other values (21) | 261 |
Words
| Value | Count | Frequency (%) |
| u.s | 116 | 10.3% |
| opinion | 115 | 10.2% |
| new | 63 | 5.6% |
| york | 63 | 5.6% |
| world | 60 | 5.3% |
| business | 49 | 4.3% |
| day | 49 | 4.3% |
| the | 49 | 4.3% |
| arts | 48 | 4.2% |
| network | 34 | 3.0% |
| Other values (31) | 484 |
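The table above counts words rather than whole values: "New York" contributes one "new" and one "york", and each share is taken over the total number of tokens, which is why "u.s" appears 116 times but is 10.3% rather than 13.8%. The report's exact tokenizer is not shown here, so the sketch below assumes lowercasing, whitespace splitting, and stripping of surrounding punctuation, which is consistent with tokens such as "u.s". `word_frequencies` is a hypothetical helper.

```python
from collections import Counter
import string


def word_frequencies(values):
    """Count lowercase tokens across all values; shares are relative to the token total."""
    tokens = []
    for value in values:
        for raw in value.lower().split():
            token = raw.strip(string.punctuation)  # "u.s." becomes "u.s"
            if token:
                tokens.append(token)
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: (n, 100.0 * n / total) for word, n in counts.most_common()}


# Toy input, not the real column.
sample = ["U.S.", "New York", "Opinion", "New York", "Business Day"]
for word, (n, pct) in word_frequencies(sample).items():
    print(f"{word}: {n} ({pct:.1f}%)")
```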
material
Categorical
| Distinct | 8 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
| News | 653 |
|---|---|
| Op-Ed | 102 |
| Review | 28 |
| Interactive Feature | 19 |
| briefing | 13 |
| Other values (3) | 24 |
Length
| Max length | 19 |
|---|---|
| Median length | 4 |
| Mean length | 4.816448153 |
| Min length | 4 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | News |
|---|---|
| 2nd row | News |
| 3rd row | News |
| 4th row | News |
| 5th row | Op-Ed |
Common Values
| Value | Count | Frequency (%) |
| News | 653 | 77.8% |
| Op-Ed | 102 | 12.2% |
| Review | 28 | 3.3% |
| Interactive Feature | 19 | 2.3% |
| briefing | 13 | 1.5% |
| Editorial | 11 | 1.3% |
| Obituary (Obit) | 9 | 1.1% |
| News Analysis | 4 | 0.5% |
Words
| Value | Count | Frequency (%) |
| news | 657 | 75.4% |
| op-ed | 102 | 11.7% |
| review | 28 | 3.2% |
| interactive | 19 | 2.2% |
| feature | 19 | 2.2% |
| briefing | 13 | 1.5% |
| editorial | 11 | 1.3% |
| obituary | 9 | 1.0% |
| obit | 9 | 1.0% |
| analysis | 4 | 0.5% |
headline
Categorical
| Distinct | 838 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
| Homes for Sale in New York and Connecticut | 2 |
|---|---|
| Mookie Betts Leads Dodgers’ Stars With a Masterly Performance | 1 |
| Bill Gates Is the Most Interesting Man in the World | 1 |
| After 190 Years, the ‘Most Famous Bar You’ve Never Heard of’ Avoids Last Call | 1 |
| Already Had Plenty of Trump 2020? | 1 |
| Other values (833) |
Length
| Max length | 112 |
|---|---|
| Median length | 57 |
| Mean length | 54.15852205 |
| Min length | 7 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 837 |
|---|---|
| Unique (%) | 99.8% |
Sample
| 1st row | Mookie Betts Leads Dodgers’ Stars With a Masterly Performance |
|---|---|
| 2nd row | Taking Baths May Be Good for Your Heart |
| 3rd row | Young Men Embrace Gender Equality, but They Still Don’t Vacuum |
| 4th row | 2 Weeks, 6.5 Million Coronavirus Tests as Wuhan Nears Goal |
| 5th row | Is the Stock Market Rooting for Trump or Biden? |
Common Values
| Value | Count | Frequency (%) |
| Homes for Sale in New York and Connecticut | 2 | 0.2% |
| Mookie Betts Leads Dodgers’ Stars With a Masterly Performance | 1 | 0.1% |
| Bill Gates Is the Most Interesting Man in the World | 1 | 0.1% |
| After 190 Years, the ‘Most Famous Bar You’ve Never Heard of’ Avoids Last Call | 1 | 0.1% |
| Already Had Plenty of Trump 2020? | 1 | 0.1% |
| How the Virus Slowed the Booming Wind Energy Business | 1 | 0.1% |
| Coronavirus Cases Rise Sharply in Prisons Even as They Plateau Nationwide | 1 | 0.1% |
| Museums Are Back, but Different: A Visitor’s Guide | 1 | 0.1% |
| Jobless Numbers Are ‘Eye-Watering’ but Understate the Crisis | 1 | 0.1% |
| For Veterans Day, Some Former Military Officers Reflect on Lessons From Their Parents | 1 | 0.1% |
| Other values (828) | 828 |
Words
| Value | Count | Frequency (%) |
| the | 312 | 4.1% |
| a | 209 | 2.7% |
| to | 163 | 2.1% |
| in | 145 | 1.9% |
| of | 143 | 1.9% |
| and | 117 | 1.5% |
| for | 88 | 1.2% |
| is | 84 | 1.1% |
| on | 64 | 0.8% |
| coronavirus | 56 | 0.7% |
| Other values (3094) | 6225 |
abstract
Categorical
| Distinct | 837 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 1 |
| Missing (%) | 0.1% |
| Memory size | 6.7 KiB |
| Look closely at this image, stripped of its caption, and join the moderated conversation about what you and other students see. | 2 |
|---|---|
| The Dodgers overwhelmed the Rays in Game 1, with Betts showing off the consistent excellence that led Los Angeles to sign him to a 12-year contract. | 1 |
| He’s everywhere, this lavender-sweatered Mister Rogers for the curious and quarantined. | 1 |
| The owner of Neir’s Tavern in Queens said it would close on Sunday, barring a “miracle.” The city, it turns out, works in mysterious ways. | 1 |
| He’s a bad show, but it’s not low-flow. | 1 |
| Other values (832) |
Length
| Max length | 344 |
|---|---|
| Median length | 130 |
| Mean length | 126.8078759 |
| Min length | 23 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 836 |
|---|---|
| Unique (%) | 99.8% |
Sample
| 1st row | The Dodgers overwhelmed the Rays in Game 1, with Betts showing off the consistent excellence that led Los Angeles to sign him to a 12-year contract. |
|---|---|
| 2nd row | Daily baths reduced the risk of heart disease and stroke. |
| 3rd row | New studies show traditional views persist about who does what at home, and it’s holding women back. |
| 4th row | The Chinese city where the outbreak began is seeking to test all its 11 million residents, and the pandemic has forced the fashion industry to take a hard look in the mirror. |
| 5th row | Neither. Wall Street is not as partisan as you think. |
Common Values
| Value | Count | Frequency (%) |
| Look closely at this image, stripped of its caption, and join the moderated conversation about what you and other students see. | 2 | 0.2% |
| The Dodgers overwhelmed the Rays in Game 1, with Betts showing off the consistent excellence that led Los Angeles to sign him to a 12-year contract. | 1 | 0.1% |
| He’s everywhere, this lavender-sweatered Mister Rogers for the curious and quarantined. | 1 | 0.1% |
| The owner of Neir’s Tavern in Queens said it would close on Sunday, barring a “miracle.” The city, it turns out, works in mysterious ways. | 1 | 0.1% |
| He’s a bad show, but it’s not low-flow. | 1 | 0.1% |
| Renewable energy developers have struggled to finish projects as the pandemic disrupts construction and global supply chains. | 1 | 0.1% |
| Prison officials have been reluctant to do widespread virus testing even as infection rates are escalating. | 1 | 0.1% |
| The visitors may be masked, but the art is gradually coming into full view. | 1 | 0.1% |
| With 4.4 million added last week, the five-week total passed 26 million. The struggle by states to field claims has hampered economic recovery. | 1 | 0.1% |
| The values that shaped them include leadership, optimism and charting your own course. | 1 | 0.1% |
| Other values (827) | 827 |
Words
| Value | Count | Frequency (%) |
| the | 1065 | 6.1% |
| a | 537 | 3.1% |
| to | 471 | 2.7% |
| and | 459 | 2.6% |
| of | 454 | 2.6% |
| in | 396 | 2.3% |
| is | 184 | 1.1% |
| for | 164 | 0.9% |
| are | 141 | 0.8% |
| with | 135 | 0.8% |
| Other values (5169) | 13496 |
keywords
Categorical
| Distinct | 781 |
|---|---|
| Distinct (%) | 93.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
| [] | 33 |
|---|---|
| ['Crossword Puzzles'] | 13 |
| ['Coronavirus (2019-nCoV)'] | 10 |
| ['New York City'] | 4 |
| ['Trump, Donald J', 'Presidential Election of 2020', 'United States Politics and Government'] | 2 |
| Other values (776) |
Length
| Max length | 1381 |
|---|---|
| Median length | 159 |
| Mean length | 170.0059595 |
| Min length | 2 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 775 |
|---|---|
| Unique (%) | 92.4% |
Sample
| 1st row | ['Baseball', 'World Series', 'Boston Red Sox', 'Los Angeles Dodgers', 'Tampa Bay Rays', 'Bellinger, Cody (1995- )', 'Betts, Mookie (1992- )', 'Kershaw, Clayton'] |
|---|---|
| 2nd row | ['Bathing and Showering', 'Heart', 'Blood Pressure'] |
| 3rd row | ['Work-Life Balance', 'Women and Girls', 'Research', 'Men and Boys', 'Parenting', 'Labor and Jobs', 'United States'] |
| 4th row | ['Coronavirus (2019-nCoV)'] |
| 5th row | ['Presidential Election of 2020', 'United States Economy', 'Stocks and Bonds'] |
Common Values
| Value | Count | Frequency (%) |
| [] | 33 | 3.9% |
| ['Crossword Puzzles'] | 13 | 1.5% |
| ['Coronavirus (2019-nCoV)'] | 10 | 1.2% |
| ['New York City'] | 4 | 0.5% |
| ['Trump, Donald J', 'Presidential Election of 2020', 'United States Politics and Government'] | 2 | 0.2% |
| ['Television', 'Billions (TV Program)'] | 2 | 0.2% |
| ['Bicycles and Bicycling', 'Women and Girls', 'Coronavirus (2019-nCoV)', 'Traffic Accidents and Safety', 'Commuting', 'Citi Bike', 'New York University', 'Transportation Alternatives', 'New York City'] | 1 | 0.1% |
| ['Museums', 'Coronavirus Reopenings', 'AMERICAN MUSEUM OF NATURAL HISTORY', 'Dallas Museum of Art', 'Gardner, Isabella Stewart, Museum', 'Los Angeles County Museum of Art', 'Metropolitan Museum of Art', 'Museum of Fine Arts (Boston)', 'Museum of Modern Art', 'National Gallery of Art', 'Whitney Museum of American Art', 'Perez, Jorge M, Art Museum of Miami-Dade County', 'Museum of Fine Arts (Houston)', 'Cleveland Museum of Art', 'Sirmans, Franklin', 'Tinterow, Gary', 'Weinberg, Adam D', 'Zumthor, Peter'] | 1 | 0.1% |
| ['Coronavirus Aid, Relief, and Economic Security Act (2020)', 'Coronavirus (2019-nCoV)', 'Unemployment', 'Unemployment Insurance', 'Layoffs and Job Reductions', 'Labor and Jobs', 'United States Economy', 'States (US)'] | 1 | 0.1% |
| ['United States Defense and Military Forces', 'Children and Childhood', 'Careers and Professions', 'Parenting', 'Families and Family Life', 'Defense Department', 'United States Army', 'United States Marine Corps', 'United States Navy'] | 1 | 0.1% |
| Other values (771) | 771 |
Words
| Value | Count | Frequency (%) |
| and | 1310 | 7.9% |
| coronavirus | 352 | 2.1% |
| 2019-ncov | 289 | 1.8% |
| states | 277 | 1.7% |
| united | 267 | 1.6% |
| of | 228 | 1.4% |
| government | 190 | 1.2% |
| politics | 190 | 1.2% |
| j | 150 | 0.9% |
| 150 | 0.9% | |
| Other values (3593) | 13090 |
word_count
Real number (ℝ≥0)
| Distinct | 642 |
|---|---|
| Distinct (%) | 76.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1343.942789 |
| Minimum | 0 |
| Maximum | 11020 |
| Zeros | 19 |
| Zeros (%) | 2.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 268.7 |
| Q1 | 917.5 |
| median | 1203 |
| Q3 | 1497.5 |
| 95-th percentile | 2539.2 |
| Maximum | 11020 |
| Range | 11020 |
| Interquartile range (IQR) | 580 |
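Non-integer quantiles such as 268.7 and 917.5 on an integer-valued column suggest that the percentiles are interpolated between neighbouring observations. The sketch below assumes linear interpolation, which is the pandas default for `Series.quantile`; `quantile` here is an illustrative stand-alone helper, not the report's own code.

```python
def quantile(values, q):
    """Quantile with linear interpolation between the two nearest order statistics."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


toy = [10, 20, 30, 40, 50]
q1, q3 = quantile(toy, 0.25), quantile(toy, 0.75)
print(q1, q3, q3 - q1)

# The reported IQR is consistent with Q3 - Q1.
print(1497.5 - 917.5)  # 580.0
```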
Descriptive statistics
| Standard deviation | 1024.445849 |
|---|---|
| Coefficient of variation (CV) | 0.7622689428 |
| Kurtosis | 27.90786588 |
| Mean | 1343.942789 |
| Median Absolute Deviation (MAD) | 291 |
| Skewness | 4.331902981 |
| Sum | 1127568 |
| Variance | 1049489.297 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 19 | 2.3% |
| 1439 | 5 | 0.6% |
| 1226 | 5 | 0.6% |
| 1270 | 4 | 0.5% |
| 1026 | 4 | 0.5% |
| 1369 | 4 | 0.5% |
| 1424 | 3 | 0.4% |
| 1048 | 3 | 0.4% |
| 985 | 3 | 0.4% |
| 1286 | 3 | 0.4% |
| Other values (632) | 786 |
| Value | Count | Frequency (%) |
| 0 | 19 | |
| 10 | 1 | 0.1% |
| 16 | 2 | 0.2% |
| 74 | 1 | 0.1% |
| 77 | 1 | 0.1% |
| 115 | 1 | 0.1% |
| 118 | 1 | 0.1% |
| 133 | 1 | 0.1% |
| 136 | 1 | 0.1% |
| 137 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 11020 | 1 | |
| 10329 | 1 | |
| 8801 | 1 | |
| 8223 | 1 | |
| 7815 | 1 | |
| 6701 | 1 | |
| 6522 | 1 | |
| 6325 | 1 | |
| 5919 | 1 | |
| 5894 | 1 |
pub_date
Date
| Distinct | 834 |
|---|---|
| Distinct (%) | 99.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
| Minimum | 2020-01-01 10:00:01+00:00 |
| Maximum | 2020-12-31 10:01:02+00:00 |
n_comments
Real number (ℝ≥0)
| Distinct | 408 |
|---|---|
| Distinct (%) | 48.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 309.0441001 |
| Minimum | 1 |
| Maximum | 5702 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 20 |
| median | 92 |
| Q3 | 310.5 |
| 95-th percentile | 1333.7 |
| Maximum | 5702 |
| Range | 5701 |
| Interquartile range (IQR) | 290.5 |
Descriptive statistics
| Standard deviation | 557.7115825 |
|---|---|
| Coefficient of variation (CV) | 1.804634297 |
| Kurtosis | 19.57107595 |
| Mean | 309.0441001 |
| Median Absolute Deviation (MAD) | 85 |
| Skewness | 3.744530566 |
| Sum | 259288 |
| Variance | 311042.2093 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2 | 19 | 2.3% |
| 8 | 18 | 2.1% |
| 3 | 14 | 1.7% |
| 12 | 14 | 1.7% |
| 6 | 14 | 1.7% |
| 1 | 13 | 1.5% |
| 4 | 13 | 1.5% |
| 7 | 11 | 1.3% |
| 10 | 11 | 1.3% |
| 15 | 10 | 1.2% |
| Other values (398) | 702 |
| Value | Count | Frequency (%) |
| 1 | 13 | |
| 2 | 19 | |
| 3 | 14 | |
| 4 | 13 | |
| 5 | 8 | |
| 6 | 14 | |
| 7 | 11 | |
| 8 | 18 | |
| 9 | 10 | |
| 10 | 11 |
| Value | Count | Frequency (%) |
| 5702 | 1 | |
| 3745 | 1 | |
| 3707 | 1 | |
| 3595 | 1 | |
| 3425 | 1 | |
| 3145 | 1 | |
| 3131 | 1 | |
| 2570 | 1 | |
| 2556 | 1 | |
| 2526 | 1 |
uniqueID
Categorical
| Distinct | 839 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
| nyt://article/905b23aa-991f-5f02-9a7b-f0a09a7cba27 | 1 |
|---|---|
| nyt://article/9872c5b0-6015-5c99-905f-7de9e3d5b107 | 1 |
| nyt://article/e109ae37-4553-513c-b51e-5409f725716f | 1 |
| nyt://article/7c07b95a-1dd9-5b0d-b320-c84e1f2cc979 | 1 |
| nyt://article/d5686346-4b09-56b4-89f3-fe1c9c291f13 | 1 |
| Other values (834) |
Length
| Max length | 54 |
|---|---|
| Median length | 50 |
| Mean length | 50.09058403 |
| Min length | 50 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 839 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | nyt://article/905b23aa-991f-5f02-9a7b-f0a09a7cba27 |
|---|---|
| 2nd row | nyt://article/40e910f7-8d5f-5beb-9e7c-7ed27ac1597c |
| 3rd row | nyt://article/d96041df-e929-51ad-8dd1-88bb84964888 |
| 4th row | nyt://article/4154be08-a296-5881-8342-e9f223dc484c |
| 5th row | nyt://article/afed02eb-f035-542a-8791-c243893e65d7 |
Common Values
| Value | Count | Frequency (%) |
| nyt://article/905b23aa-991f-5f02-9a7b-f0a09a7cba27 | 1 | 0.1% |
| nyt://article/9872c5b0-6015-5c99-905f-7de9e3d5b107 | 1 | 0.1% |
| nyt://article/e109ae37-4553-513c-b51e-5409f725716f | 1 | 0.1% |
| nyt://article/7c07b95a-1dd9-5b0d-b320-c84e1f2cc979 | 1 | 0.1% |
| nyt://article/d5686346-4b09-56b4-89f3-fe1c9c291f13 | 1 | 0.1% |
| nyt://article/e7d0466b-8190-5e04-a0f8-a94d603bcf5d | 1 | 0.1% |
| nyt://article/f9777004-2651-556d-86de-0e409359f58f | 1 | 0.1% |
| nyt://article/a7c443c8-1e8f-5426-b6d6-54347c448ece | 1 | 0.1% |
| nyt://article/57e0985e-90c8-5007-8cd0-527d7d64a5f8 | 1 | 0.1% |
| nyt://article/39ff4653-14ed-5b5f-b599-317f9fb44c1a | 1 | 0.1% |
| Other values (829) | 829 |
Words
| Value | Count | Frequency (%) |
| nyt://article/905b23aa-991f-5f02-9a7b-f0a09a7cba27 | 1 | 0.1% |
| nyt://article/8ec51655-7bc0-552c-bdc0-f620313e7e63 | 1 | 0.1% |
| nyt://article/42043d2b-a73c-5c31-ba4e-457392e735c2 | 1 | 0.1% |
| nyt://article/d96041df-e929-51ad-8dd1-88bb84964888 | 1 | 0.1% |
| nyt://article/4154be08-a296-5881-8342-e9f223dc484c | 1 | 0.1% |
| nyt://article/afed02eb-f035-542a-8791-c243893e65d7 | 1 | 0.1% |
| nyt://article/80b0675a-7867-57f2-919e-8642e939f747 | 1 | 0.1% |
| nyt://article/0e73f3b9-b692-572f-85cd-6ab8831dfffc | 1 | 0.1% |
| nyt://article/6ed0d121-21ca-54d5-b0cf-2e7cef434eb5 | 1 | 0.1% |
| nyt://article/ef8de76b-cb6b-55e2-9a1c-1e8165c20cb7 | 1 | 0.1% |
| Other values (829) | 829 |
print_section
Categorical
| Distinct | 16 |
|---|---|
| Distinct (%) | 2.7% |
| Missing | 252 |
| Missing (%) | 30.0% |
| Memory size | 6.7 KiB |
| A | 289 |
|---|---|
| B | 82 |
| D | 64 |
| C | 45 |
| SR | 17 |
| Other values (11) | 90 |
Length
| Max length | 2 |
|---|---|
| Median length | 1 |
| Mean length | 1.170357751 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | B |
|---|---|
| 2nd row | D |
| 3rd row | B |
| 4th row | A |
| 5th row | A |
Common Values
| Value | Count | Frequency (%) |
| A | 289 | 34.4% |
| B | 82 | 9.8% |
| D | 64 | 7.6% |
| C | 45 | 5.4% |
| SR | 17 | 2.0% |
| RE | 16 | 1.9% |
| MM | 15 | 1.8% |
| AR | 13 | 1.5% |
| MB | 13 | 1.5% |
| BR | 11 | 1.3% |
| Other values (6) | 22 | 2.6% |
| (Missing) | 252 | 30.0% |
Words
| Value | Count | Frequency (%) |
| a | 289 | 49.2% |
| b | 82 | 14.0% |
| d | 64 | 10.9% |
| c | 45 | 7.7% |
| sr | 17 | 2.9% |
| re | 16 | 2.7% |
| mm | 15 | 2.6% |
| ar | 13 | 2.2% |
| mb | 13 | 2.2% |
| br | 11 | 1.9% |
| Other values (6) | 22 | 3.7% |
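The same count appears with two different percentages in the two tables for this variable because the denominators differ: Common Values divides by all 839 rows, missing ones included, while the table above divides by the 587 non-missing values. A small sketch of that arithmetic, using only figures from this report; `pct` is an illustrative helper:

```python
def pct(count: int, total: int) -> float:
    """Percentage rounded to one decimal, as shown in the tables."""
    return round(100.0 * count / total, 1)


n_rows, n_missing = 839, 252
n_present = n_rows - n_missing  # 587 rows with a non-missing value

print(pct(82, n_rows))     # 9.8, share of all rows (Common Values)
print(pct(82, n_present))  # 14.0, share of non-missing values
print(pct(16, n_present))  # 2.7, Distinct (%) also uses non-missing rows
```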
| Distinct | 32 |
|---|---|
| Distinct (%) | 5.5% |
| Missing | 252 |
| Missing (%) | 30.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8.991482112 |
| Minimum | 1 |
| Maximum | 36 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 6 |
| Q3 | 16 |
| 95-th percentile | 26 |
| Maximum | 36 |
| Range | 35 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 8.603808726 |
|---|---|
| Coefficient of variation (CV) | 0.9568843733 |
| Kurtosis | -0.4803899181 |
| Mean | 8.991482112 |
| Median Absolute Deviation (MAD) | 5 |
| Skewness | 0.8861114949 |
| Sum | 5278 |
| Variance | 74.02552459 |
| Monotonicity | Not monotonic |
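Several rows in the descriptive statistics above follow arithmetically from one another. A minimal sketch that re-derives them from the reported figures, assuming the mean is taken over the 587 non-missing observations (839 minus 252) and that the coefficient of variation is the standard deviation divided by the mean. The variable names are illustrative:

```python
# Figures copied from the tables above.
total_sum = 5278
n_present = 839 - 252            # observations minus missing values
std_dev = 8.603808726
q1, q3 = 1, 16
minimum, maximum = 1, 36

mean = total_sum / n_present     # about 8.9915
variance = std_dev ** 2          # about 74.0255
cv = std_dev / mean              # about 0.9569
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"mean={mean:.6f} variance={variance:.4f} cv={cv:.6f} iqr={iqr} range={value_range}")
```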
| Value | Count | Frequency (%) |
| 1 | 158 | 18.8% |
| 6 | 42 | 5.0% |
| 2 | 39 | 4.6% |
| 3 | 32 | 3.8% |
| 4 | 32 | 3.8% |
| 10 | 19 | 2.3% |
| 5 | 19 | 2.3% |
| 23 | 19 | 2.3% |
| 12 | 18 | 2.1% |
| 8 | 18 | 2.1% |
| Other values (22) | 191 | 22.8% |
| (Missing) | 252 | 30.0% |
| Value | Count | Frequency (%) |
| 1 | 158 | 18.8% |
| 2 | 39 | 4.6% |
| 3 | 32 | 3.8% |
| 4 | 32 | 3.8% |
| 5 | 19 | 2.3% |
| 6 | 42 | 5.0% |
| 7 | 17 | 2.0% |
| 8 | 18 | 2.1% |
| 9 | 15 | 1.8% |
| 10 | 19 | 2.3% |
| Value | Count | Frequency (%) |
| 36 | 1 | 0.1% |
| 35 | 1 | 0.1% |
| 30 | 4 | 0.5% |
| 29 | 2 | 0.2% |
| 28 | 2 | 0.2% |
| 27 | 15 | 1.8% |
| 26 | 9 | 1.1% |
| 25 | 4 | 0.5% |
| 24 | 7 | 0.8% |
| 23 | 19 | 2.3% |
| Distinct | 785 |
|---|---|
| Distinct (%) | 93.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
| (empty) | 15 |
| [Want to get New York Today by email? Here’s the sign-up.] | 12 |
| Listen and subscribe to our podcast from your mobile device:Via Apple Podcasts \| Via Spotify \| Via Stitcher | 9 |
| Times Insider explains who we are and what we do, and delivers behind-the-scenes insights into how our journalism comes together. | 5 |
| Click on the slide show to see this week’s featured properties: | 4 |
| Other values (780) | 794 |
Length
| Max length | 1162 |
|---|---|
| Median length | 232 |
| Mean length | 239.2443385 |
| Min length | 0 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 770 |
|---|---|
| Unique (%) | 91.8% |
Sample
| 1st row | ARLINGTON, Texas — Mookie Betts was 24 years old when he wondered if he would ever be any better. He had just finished as runner-up to Mike Trout for the American League Most Valuable Player Award in 2016, with a season full of hits and homers and steals and defensive excellence. What could he do next? |
|---|---|
| 2nd row | Taking frequent baths may reduce the risk for cardiovascular disease, new research suggests. |
| 3rd row | Young people today have become much more open-minded about gender roles — it shows up in their attitudes about pronouns, politics and sports. But in one area, change has been minimal. They are holding on to traditional views about who does what at home. |
| 4th row | 新冠病毒疫情最新消息 |
| 5th row | For months the S&P 500 rose this year — despite a deadly pandemic, the resulting economic devastation and the rise of a Democratic Party increasingly sympathetic to democratic socialism. Then, this month, with Joe Biden doing well in the polls, stock prices finally stumbled. |
Common Values
| Value | Count | Frequency (%) |
| (empty) | 15 | 1.8% |
| [Want to get New York Today by email? Here’s the sign-up.] | 12 | 1.4% |
| Listen and subscribe to our podcast from your mobile device:Via Apple Podcasts \| Via Spotify \| Via Stitcher | 9 | 1.1% |
| Times Insider explains who we are and what we do, and delivers behind-the-scenes insights into how our journalism comes together. | 5 | 0.6% |
| Click on the slide show to see this week’s featured properties: | 4 | 0.5% |
| Dear Diary: | 4 | 0.5% |
| This briefing has ended. Follow our latest coverage of the coronavirus pandemic. | 3 | 0.4% |
| Welcome to Best of Late Night, a rundown of the previous night’s highlights that lets you sleep — and lets us get paid to watch comedy. We’re all stuck at home at the moment, so here are the 50 best movies on Netflix right now. | 3 | 0.4% |
| [Follow the DNC Live: Biden’s speech, schedule, start time, streaming and more.] | 2 | 0.2% |
| [Read our live updates on President Trump’s coronavirus diagnosis.] | 2 | 0.2% |
| Other values (775) | 780 | 93.0% |
Length
| Value | Count | Frequency (%) |
| the | 1849 | 5.5% |
| a | 1026 | 3.0% |
| of | 940 | 2.8% |
| to | 842 | 2.5% |
| and | 828 | 2.4% |
| in | 761 | 2.2% |
| on | 354 | 1.0% |
| that | 339 | 1.0% |
| — | 317 | 0.9% |
| for | 304 | 0.9% |
| Other values (7763) | 26295 | 77.7% |
| Distinct | 838 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
| Homes for Sale in New York and Connecticut | 2 |
|---|---|
| Mookie Betts Leads Dodgers’ Stars With a Masterly Performance | 1 |
| Bill Gates Is the Most Interesting Man in the World | 1 |
| After 190 Years, the ‘Most Famous Bar You’ve Never Heard of’ Avoids Last Call | 1 |
| Already Had Plenty of Trump 2020? | 1 |
| Other values (833) | 833 |
Length
| Max length | 112 |
|---|---|
| Median length | 57 |
| Mean length | 54.15852205 |
| Min length | 7 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 837 |
|---|---|
| Unique (%) | 99.8% |
Sample
| 1st row | Mookie Betts Leads Dodgers’ Stars With a Masterly Performance |
|---|---|
| 2nd row | Taking Baths May Be Good for Your Heart |
| 3rd row | Young Men Embrace Gender Equality, but They Still Don’t Vacuum |
| 4th row | 2 Weeks, 6.5 Million Coronavirus Tests as Wuhan Nears Goal |
| 5th row | Is the Stock Market Rooting for Trump or Biden? |
Common Values
| Value | Count | Frequency (%) |
| Homes for Sale in New York and Connecticut | 2 | 0.2% |
| Mookie Betts Leads Dodgers’ Stars With a Masterly Performance | 1 | 0.1% |
| Bill Gates Is the Most Interesting Man in the World | 1 | 0.1% |
| After 190 Years, the ‘Most Famous Bar You’ve Never Heard of’ Avoids Last Call | 1 | 0.1% |
| Already Had Plenty of Trump 2020? | 1 | 0.1% |
| How the Virus Slowed the Booming Wind Energy Business | 1 | 0.1% |
| Coronavirus Cases Rise Sharply in Prisons Even as They Plateau Nationwide | 1 | 0.1% |
| Museums Are Back, but Different: A Visitor’s Guide | 1 | 0.1% |
| Jobless Numbers Are ‘Eye-Watering’ but Understate the Crisis | 1 | 0.1% |
| For Veterans Day, Some Former Military Officers Reflect on Lessons From Their Parents | 1 | 0.1% |
| Other values (828) | 828 | 98.7% |
Length
| Value | Count | Frequency (%) |
| the | 312 | 4.1% |
| a | 209 | 2.7% |
| to | 163 | 2.1% |
| in | 145 | 1.9% |
| of | 143 | 1.9% |
| and | 117 | 1.5% |
| for | 88 | 1.2% |
| is | 84 | 1.1% |
| on | 64 | 0.8% |
| coronavirus | 56 | 0.7% |
| Other values (3094) | 6225 | 81.8% |
| Distinct | 12 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.383790226 |
| Minimum | 1 |
| Maximum | 12 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 6 |
| Q3 | 9 |
| 95-th percentile | 12 |
| Maximum | 12 |
| Range | 11 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 3.426420502 |
|---|---|
| Coefficient of variation (CV) | 0.5367376402 |
| Kurtosis | -1.230186716 |
| Mean | 6.383790226 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 0.07258068568 |
| Sum | 5356 |
| Variance | 11.74035745 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 5 | 88 | 10.5% |
| 4 | 83 | 9.9% |
| 9 | 80 | 9.5% |
| 3 | 79 | 9.4% |
| 10 | 76 | 9.1% |
| 12 | 67 | 8.0% |
| 1 | 67 | 8.0% |
| 2 | 66 | 7.9% |
| 8 | 63 | 7.5% |
| 6 | 62 | 7.4% |
| Other values (2) | 108 | 12.9% |
| Value | Count | Frequency (%) |
| 1 | 67 | 8.0% |
| 2 | 66 | 7.9% |
| 3 | 79 | 9.4% |
| 4 | 83 | 9.9% |
| 5 | 88 | 10.5% |
| 6 | 62 | 7.4% |
| 7 | 50 | 6.0% |
| 8 | 63 | 7.5% |
| 9 | 80 | 9.5% |
| 10 | 76 | 9.1% |
| Value | Count | Frequency (%) |
| 12 | 67 | 8.0% |
| 11 | 58 | 6.9% |
| 10 | 76 | 9.1% |
| 9 | 80 | 9.5% |
| 8 | 63 | 7.5% |
| 7 | 50 | 6.0% |
| 6 | 62 | 7.4% |
| 5 | 88 | 10.5% |
| 4 | 83 | 9.9% |
| 3 | 79 | 9.4% |
| Distinct | 1 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
| 1 | 839 |
|---|---|
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 839 | 100.0% |
Length
Pie chart
| Value | Count | Frequency (%) |
| 1 | 839 | 100.0% |
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 KiB |
| S | 428 |
|---|---|
| M | 213 |
| L | 198 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | S |
|---|---|
| 2nd row | S |
| 3rd row | L |
| 4th row | S |
| 5th row | M |
Common Values
| Value | Count | Frequency (%) |
| S | 428 | 51.0% |
| M | 213 | 25.4% |
| L | 198 | 23.6% |
Length
Pie chart
| Value | Count | Frequency (%) |
| s | 428 | 51.0% |
| m | 213 | 25.4% |
| l | 198 | 23.6% |
| Distinct | 37 |
|---|---|
| Distinct (%) | 4.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.05363528 |
| Minimum | 0 |
| Maximum | 46 |
| Zeros | 27 |
| Zeros (%) | 3.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 5 |
| median | 9 |
| Q3 | 12 |
| 95-th percentile | 19 |
| Maximum | 46 |
| Range | 46 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 6.063564097 |
|---|---|
| Coefficient of variation (CV) | 0.6697380566 |
| Kurtosis | 4.505171217 |
| Mean | 9.05363528 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | 1.419551795 |
| Sum | 7596 |
| Variance | 36.76680956 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 9 | 70 | 8.3% |
| 11 | 61 | 7.3% |
| 8 | 58 | 6.9% |
| 4 | 56 | 6.7% |
| 7 | 56 | 6.7% |
| 6 | 56 | 6.7% |
| 10 | 55 | 6.6% |
| 12 | 54 | 6.4% |
| 3 | 48 | 5.7% |
| 5 | 42 | 5.0% |
| Other values (27) | 283 | 33.7% |
| Value | Count | Frequency (%) |
| 0 | 27 | 3.2% |
| 1 | 34 | 4.1% |
| 2 | 39 | 4.6% |
| 3 | 48 | 5.7% |
| 4 | 56 | 6.7% |
| 5 | 42 | 5.0% |
| 6 | 56 | 6.7% |
| 7 | 56 | 6.7% |
| 8 | 58 | 6.9% |
| 9 | 70 | 8.3% |
| Value | Count | Frequency (%) |
| 46 | 1 | 0.1% |
| 44 | 1 | 0.1% |
| 40 | 1 | 0.1% |
| 35 | 1 | 0.1% |
| 34 | 1 | 0.1% |
| 33 | 1 | 0.1% |
| 31 | 2 | 0.2% |
| 30 | 2 | 0.2% |
| 29 | 1 | 0.1% |
| 27 | 2 | 0.2% |
TEXT_LeadParagraph_POS_PNOUN
Real number (ℝ≥0)
HIGH CORRELATION, ZEROS
| Distinct | 21 |
|---|---|
| Distinct (%) | 2.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.468414779 |
| Minimum | 0 |
| Maximum | 24 |
| Zeros | 130 |
| Zeros (%) | 15.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2 |
| median | 4 |
| Q3 | 6 |
| 95-th percentile | 12 |
| Maximum | 24 |
| Range | 24 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 3.787133252 |
|---|---|
| Coefficient of variation (CV) | 0.8475339554 |
| Kurtosis | 2.061353635 |
| Mean | 4.468414779 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 1.226488208 |
| Sum | 3749 |
| Variance | 14.34237827 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 130 | 15.5% |
| 3 | 107 | 12.8% |
| 2 | 102 | 12.2% |
| 5 | 85 | 10.1% |
| 4 | 84 | 10.0% |
| 6 | 74 | 8.8% |
| 1 | 57 | 6.8% |
| 7 | 56 | 6.7% |
| 8 | 32 | 3.8% |
| 9 | 30 | 3.6% |
| Other values (11) | 82 | 9.8% |
| Value | Count | Frequency (%) |
| 0 | 130 | 15.5% |
| 1 | 57 | 6.8% |
| 2 | 102 | 12.2% |
| 3 | 107 | 12.8% |
| 4 | 84 | 10.0% |
| 5 | 85 | 10.1% |
| 6 | 74 | 8.8% |
| 7 | 56 | 6.7% |
| 8 | 32 | 3.8% |
| 9 | 30 | 3.6% |
| Value | Count | Frequency (%) |
| 24 | 1 | 0.1% |
| 21 | 1 | 0.1% |
| 19 | 3 | 0.4% |
| 17 | 3 | 0.4% |
| 16 | 4 | 0.5% |
| 15 | 8 | 1.0% |
| 14 | 9 | 1.1% |
| 13 | 8 | 1.0% |
| 12 | 7 | 0.8% |
| 11 | 13 | 1.5% |
| Distinct | 15 |
|---|---|
| Distinct (%) | 1.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.991656734 |
| Minimum | 0 |
| Maximum | 17 |
| Zeros | 116 |
| Zeros (%) | 13.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 3 |
| Q3 | 4 |
| 95-th percentile | 7 |
| Maximum | 17 |
| Range | 17 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.369994813 |
|---|---|
| Coefficient of variation (CV) | 0.7922014535 |
| Kurtosis | 1.980832925 |
| Mean | 2.991656734 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 1.063436947 |
| Sum | 2510 |
| Variance | 5.616875414 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2 | 151 | 18.0% |
| 1 | 140 | 16.7% |
| 3 | 126 | 15.0% |
| 0 | 116 | 13.8% |
| 4 | 114 | 13.6% |
| 5 | 72 | 8.6% |
| 6 | 49 | 5.8% |
| 7 | 31 | 3.7% |
| 8 | 22 | 2.6% |
| 9 | 8 | 1.0% |
| Other values (5) | 10 | 1.2% |
| Value | Count | Frequency (%) |
| 0 | 116 | 13.8% |
| 1 | 140 | 16.7% |
| 2 | 151 | 18.0% |
| 3 | 126 | 15.0% |
| 4 | 114 | 13.6% |
| 5 | 72 | 8.6% |
| 6 | 49 | 5.8% |
| 7 | 31 | 3.7% |
| 8 | 22 | 2.6% |
| 9 | 8 | 1.0% |
| Value | Count | Frequency (%) |
| 17 | 1 | 0.1% |
| 14 | 1 | 0.1% |
| 13 | 1 | 0.1% |
| 11 | 1 | 0.1% |
| 10 | 6 | 0.7% |
| 9 | 8 | 1.0% |
| 8 | 22 | 2.6% |
| 7 | 31 | 3.7% |
| 6 | 49 | 5.8% |
| 5 | 72 | 8.6% |
TEXT_Keywords_POS_PNOUN
Real number (ℝ≥0)
HIGH CORRELATION, ZEROS
| Distinct | 49 |
|---|---|
| Distinct (%) | 5.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.78307509 |
| Minimum | 0 |
| Maximum | 146 |
| Zeros | 37 |
| Zeros (%) | 4.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 7 |
| median | 12 |
| Q3 | 19 |
| 95-th percentile | 30 |
| Maximum | 146 |
| Range | 146 |
| Interquartile range (IQR) | 12 |
Descriptive statistics
| Standard deviation | 10.68615635 |
|---|---|
| Coefficient of variation (CV) | 0.7753100292 |
| Kurtosis | 32.24330063 |
| Mean | 13.78307509 |
| Median Absolute Deviation (MAD) | 6 |
| Skewness | 3.515649292 |
| Sum | 11564 |
| Variance | 114.1939375 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 6 | 44 | 5.2% |
| 9 | 44 | 5.2% |
| 12 | 43 | 5.1% |
| 2 | 42 | 5.0% |
| 10 | 40 | 4.8% |
| 0 | 37 | 4.4% |
| 8 | 36 | 4.3% |
| 13 | 35 | 4.2% |
| 7 | 35 | 4.2% |
| 5 | 32 | 3.8% |
| Other values (39) | 451 | 53.8% |
| Value | Count | Frequency (%) |
| 0 | 37 | 4.4% |
| 1 | 4 | 0.5% |
| 2 | 42 | 5.0% |
| 3 | 16 | 1.9% |
| 4 | 28 | 3.3% |
| 5 | 32 | 3.8% |
| 6 | 44 | 5.2% |
| 7 | 35 | 4.2% |
| 8 | 36 | 4.3% |
| 9 | 44 | 5.2% |
| Value | Count | Frequency (%) |
| 146 | 1 | 0.1% |
| 83 | 1 | 0.1% |
| 81 | 1 | 0.1% |
| 73 | 1 | 0.1% |
| 57 | 1 | 0.1% |
| 53 | 1 | 0.1% |
| 51 | 1 | 0.1% |
| 45 | 1 | 0.1% |
| 44 | 1 | 0.1% |
| 41 | 2 | 0.2% |
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
| 0 | 354 |
|---|---|
| 1 | 327 |
| 2 | 121 |
| 3 | 31 |
| 4 | 6 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2 |
|---|---|
| 2nd row | 2 |
| 3rd row | 0 |
| 4th row | 2 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 354 | 42.2% |
| 1 | 327 | 39.0% |
| 2 | 121 | 14.4% |
| 3 | 31 | 3.7% |
| 4 | 6 | 0.7% |
Length
Pie chart
| Value | Count | Frequency (%) |
| 0 | 354 | 42.2% |
| 1 | 327 | 39.0% |
| 2 | 121 | 14.4% |
| 3 | 31 | 3.7% |
| 4 | 6 | 0.7% |
| Distinct | 11 |
|---|---|
| Distinct (%) | 1.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.567342074 |
| Minimum | 0 |
| Maximum | 10 |
| Zeros | 56 |
| Zeros (%) | 6.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2 |
| median | 3 |
| Q3 | 5 |
| 95-th percentile | 7 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.030185447 |
|---|---|
| Coefficient of variation (CV) | 0.569103104 |
| Kurtosis | -0.3618087253 |
| Mean | 3.567342074 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.2616108381 |
| Sum | 2993 |
| Variance | 4.121652951 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 3 | 153 | 18.2% |
| 2 | 144 | 17.2% |
| 4 | 136 | 16.2% |
| 5 | 132 | 15.7% |
| 6 | 75 | 8.9% |
| 1 | 74 | 8.8% |
| 0 | 56 | 6.7% |
| 7 | 43 | 5.1% |
| 8 | 19 | 2.3% |
| 9 | 5 | 0.6% |
| Value | Count | Frequency (%) |
| 0 | 56 | 6.7% |
| 1 | 74 | 8.8% |
| 2 | 144 | 17.2% |
| 3 | 153 | 18.2% |
| 4 | 136 | 16.2% |
| 5 | 132 | 15.7% |
| 6 | 75 | 8.9% |
| 7 | 43 | 5.1% |
| 8 | 19 | 2.3% |
| 9 | 5 | 0.6% |
| Value | Count | Frequency (%) |
| 10 | 2 | 0.2% |
| 9 | 5 | 0.6% |
| 8 | 19 | 2.3% |
| 7 | 43 | 5.1% |
| 6 | 75 | 8.9% |
| 5 | 132 | 15.7% |
| 4 | 136 | 16.2% |
| 3 | 153 | 18.2% |
| 2 | 144 | 17.2% |
| 1 | 74 | 8.8% |
TEXT_LeadParagraph_ENT_ORG
Real number (ℝ≥0)
HIGH CORRELATION, ZEROS
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.4541120381 |
| Minimum | 0 |
| Maximum | 5 |
| Zeros | 555 |
| Zeros (%) | 66.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 2 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.7488920324 |
|---|---|
| Coefficient of variation (CV) | 1.649134948 |
| Kurtosis | 5.213057557 |
| Mean | 0.4541120381 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.028195752 |
| Sum | 381 |
| Variance | 0.5608392762 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 555 | 66.2% |
| 1 | 212 | 25.3% |
| 2 | 55 | 6.6% |
| 3 | 10 | 1.2% |
| 4 | 6 | 0.7% |
| 5 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 0 | 555 | 66.2% |
| 1 | 212 | 25.3% |
| 2 | 55 | 6.6% |
| 3 | 10 | 1.2% |
| 4 | 6 | 0.7% |
| 5 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 5 | 1 | 0.1% |
| 4 | 6 | 0.7% |
| 3 | 10 | 1.2% |
| 2 | 55 | 6.6% |
| 1 | 212 | 25.3% |
| 0 | 555 | 66.2% |
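Because this variable takes only six distinct values, the summary statistics above can be rebuilt from the frequency table alone. A minimal sketch using Python's standard `statistics` module. It assumes the reported variance is the sample variance (n - 1 denominator), which is the convention that reproduces the figure shown:

```python
import statistics

# Value -> count, copied from the frequency table above.
value_counts = {0: 555, 1: 212, 2: 55, 3: 10, 4: 6, 5: 1}

# Expand the table back into one entry per observation.
observations = [v for v, c in value_counts.items() for _ in range(c)]

n = len(observations)                         # number of observations
total = sum(observations)                     # the "Sum" row
mean = statistics.mean(observations)          # about 0.4541
variance = statistics.variance(observations)  # sample variance, about 0.5608
median = statistics.median(observations)
zeros_pct = 100 * value_counts[0] / n         # the "Zeros (%)" row

print(n, total, round(mean, 4), round(variance, 4), median, round(zeros_pct, 1))
```

The same reconstruction should work for any low-cardinality numeric column in the report whose frequency table lists every value.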
TEXT_Keywords_ENT_ORG
Real number (ℝ≥0)
HIGH CORRELATION, ZEROS
| Distinct | 13 |
|---|---|
| Distinct (%) | 1.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.497020262 |
| Minimum | 0 |
| Maximum | 14 |
| Zeros | 262 |
| Zeros (%) | 31.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 4 |
| Maximum | 14 |
| Range | 14 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.633386218 |
|---|---|
| Coefficient of variation (CV) | 1.09109159 |
| Kurtosis | 7.991717462 |
| Mean | 1.497020262 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 2.087866594 |
| Sum | 1256 |
| Variance | 2.667950538 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 262 | 31.2% |
| 1 | 240 | 28.6% |
| 2 | 161 | 19.2% |
| 3 | 106 | 12.6% |
| 4 | 29 | 3.5% |
| 5 | 22 | 2.6% |
| 6 | 6 | 0.7% |
| 7 | 4 | 0.5% |
| 8 | 3 | 0.4% |
| 9 | 3 | 0.4% |
| Other values (3) | 3 | 0.4% |
| Value | Count | Frequency (%) |
| 0 | 262 | 31.2% |
| 1 | 240 | 28.6% |
| 2 | 161 | 19.2% |
| 3 | 106 | 12.6% |
| 4 | 29 | 3.5% |
| 5 | 22 | 2.6% |
| 6 | 6 | 0.7% |
| 7 | 4 | 0.5% |
| 8 | 3 | 0.4% |
| 9 | 3 | 0.4% |
| Value | Count | Frequency (%) |
| 14 | 1 | 0.1% |
| 11 | 1 | 0.1% |
| 10 | 1 | 0.1% |
| 9 | 3 | 0.4% |
| 8 | 3 | 0.4% |
| 7 | 4 | 0.5% |
| 6 | 6 | 0.7% |
| 5 | 22 | 2.6% |
| 4 | 29 | 3.5% |
| 3 | 106 | 12.6% |
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.1811680572 |
| Minimum | 0 |
| Maximum | 5 |
| Zeros | 720 |
| Zeros (%) | 85.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.5129148944 |
|---|---|
| Coefficient of variation (CV) | 2.83115524 |
| Kurtosis | 21.02227048 |
| Mean | 0.1811680572 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.910939011 |
| Sum | 152 |
| Variance | 0.2630816889 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 720 | 85.8% |
| 1 | 96 | 11.4% |
| 2 | 17 | 2.0% |
| 3 | 3 | 0.4% |
| 4 | 2 | 0.2% |
| 5 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 0 | 720 | 85.8% |
| 1 | 96 | 11.4% |
| 2 | 17 | 2.0% |
| 3 | 3 | 0.4% |
| 4 | 2 | 0.2% |
| 5 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 5 | 1 | 0.1% |
| 4 | 2 | 0.2% |
| 3 | 3 | 0.4% |
| 2 | 17 | 2.0% |
| 1 | 96 | 11.4% |
| 0 | 720 | 85.8% |
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
| 0 | 785 |
|---|---|
| 1 | 52 |
| 2 | 2 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 785 | 93.6% |
| 1 | 52 | 6.2% |
| 2 | 2 | 0.2% |
Length
Pie chart
| Value | Count | Frequency (%) |
| 0 | 785 | 93.6% |
| 1 | 52 | 6.2% |
| 2 | 2 | 0.2% |
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
| 0 | 806 |
|---|---|
| 1 | 31 |
| 2 | 2 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 806 | 96.1% |
| 1 | 31 | 3.7% |
| 2 | 2 | 0.2% |
Length
Pie chart
| Value | Count | Frequency (%) |
| 0 | 806 | 96.1% |
| 1 | 31 | 3.7% |
| 2 | 2 | 0.2% |
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
| 0 | 831 |
|---|---|
| 1 | 8 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 831 | 99.0% |
| 1 | 8 | 1.0% |
Length
Pie chart
| Value | Count | Frequency (%) |
| 0 | 831 | 99.0% |
| 1 | 8 | 1.0% |
| Distinct | 7 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.7604290822 |
| Minimum | 0 |
| Maximum | 7 |
| Zeros | 450 |
| Zeros (%) | 53.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 3 |
| Maximum | 7 |
| Range | 7 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.07079156 |
|---|---|
| Coefficient of variation (CV) | 1.408141252 |
| Kurtosis | 5.343947323 |
| Mean | 0.7604290822 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.955458291 |
| Sum | 638 |
| Variance | 1.146594565 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 450 | 53.6% |
| 1 | 235 | 28.0% |
| 2 | 95 | 11.3% |
| 3 | 38 | 4.5% |
| 4 | 12 | 1.4% |
| 5 | 6 | 0.7% |
| 7 | 3 | 0.4% |
| Value | Count | Frequency (%) |
| 0 | 450 | 53.6% |
| 1 | 235 | 28.0% |
| 2 | 95 | 11.3% |
| 3 | 38 | 4.5% |
| 4 | 12 | 1.4% |
| 5 | 6 | 0.7% |
| 7 | 3 | 0.4% |
| Value | Count | Frequency (%) |
| 7 | 3 | 0.4% |
| 5 | 6 | 0.7% |
| 4 | 12 | 1.4% |
| 3 | 38 | 4.5% |
| 2 | 95 | 11.3% |
| 1 | 235 | 28.0% |
| 0 | 450 | 53.6% |
| Distinct | 11 |
|---|---|
| Distinct (%) | 1.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.8915375447 |
| Minimum | 0 |
| Maximum | 24 |
| Zeros | 406 |
| Zeros (%) | 48.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 3 |
| Maximum | 24 |
| Range | 24 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.480634707 |
|---|---|
| Coefficient of variation (CV) | 1.6607654 |
| Kurtosis | 83.79575575 |
| Mean | 0.8915375447 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 6.762831153 |
| Sum | 748 |
| Variance | 2.192279137 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 406 | 48.4% |
| 1 | 257 | 30.6% |
| 2 | 115 | 13.7% |
| 3 | 40 | 4.8% |
| 4 | 11 | 1.3% |
| 7 | 4 | 0.5% |
| 5 | 2 | 0.2% |
| 14 | 1 | 0.1% |
| 13 | 1 | 0.1% |
| 24 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 0 | 406 | 48.4% |
| 1 | 257 | 30.6% |
| 2 | 115 | 13.7% |
| 3 | 40 | 4.8% |
| 4 | 11 | 1.3% |
| 5 | 2 | 0.2% |
| 7 | 4 | 0.5% |
| 8 | 1 | 0.1% |
| 13 | 1 | 0.1% |
| 14 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 24 | 1 | 0.1% |
| 14 | 1 | 0.1% |
| 13 | 1 | 0.1% |
| 8 | 1 | 0.1% |
| 7 | 4 | 0.5% |
| 5 | 2 | 0.2% |
| 4 | 11 | 1.3% |
| 3 | 40 | 4.8% |
| 2 | 115 | 13.7% |
| 1 | 257 | 30.6% |
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
| 0 | 792 |
|---|---|
| 1 | 42 |
| 2 | 4 |
| 3 | 1 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 792 | 94.4% |
| 1 | 42 | 5.0% |
| 2 | 4 | 0.5% |
| 3 | 1 | 0.1% |
Length
Pie chart
| Value | Count | Frequency (%) |
| 0 | 792 | 94.4% |
| 1 | 42 | 5.0% |
| 2 | 4 | 0.5% |
| 3 | 1 | 0.1% |
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
| 0 | 807 |
|---|---|
| 1 | 29 |
| 2 | 3 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 807 | 96.2% |
| 1 | 29 | 3.5% |
| 2 | 3 | 0.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| No values found. | ||
Most occurring categories
| Value | Count | Frequency (%) |
| No values found. | ||
Most frequent character per category
Most occurring scripts
| Value | Count | Frequency (%) |
| No values found. | ||
Most frequent character per script
Most occurring blocks
| Value | Count | Frequency (%) |
| No values found. | ||
Most frequent character per block
TEXT_LeadParagraph_ENT_PERSON
Real number (ℝ≥0)
HIGH CORRELATION, ZEROS
| Distinct | 7 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.6507747318 |
| Minimum | 0 |
| Maximum | 6 |
| Zeros | 461 |
| Zeros (%) | 54.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 2 |
| Maximum | 6 |
| Range | 6 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.9065375107 |
|---|---|
| Coefficient of variation (CV) | 1.393012768 |
| Kurtosis | 4.91928856 |
| Mean | 0.6507747318 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.882145386 |
| Sum | 546 |
| Variance | 0.8218102583 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 461 | 54.9% |
| 1 | 263 | 31.3% |
| 2 | 82 | 9.8% |
| 3 | 20 | 2.4% |
| 4 | 7 | 0.8% |
| 5 | 5 | 0.6% |
| 6 | 1 | 0.1% |
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 461 | 54.9% |
| 1 | 263 | 31.3% |
| 2 | 82 | 9.8% |
| 3 | 20 | 2.4% |
| 4 | 7 | 0.8% |
| 5 | 5 | 0.6% |
| 6 | 1 | 0.1% |
| Value | Count | Frequency (%) |
|---|---|---|
| 6 | 1 | 0.1% |
| 5 | 5 | 0.6% |
| 4 | 7 | 0.8% |
| 3 | 20 | 2.4% |
| 2 | 82 | 9.8% |
| 1 | 263 | 31.3% |
| 0 | 461 | 54.9% |
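As a cross-check, the summary statistics reported for TEXT_LeadParagraph_ENT_PERSON can be recomputed from its frequency table alone. The sketch below is illustrative and not part of the generated report. The names `counts`, `n`, and `total` are introduced here, and it assumes the report uses the sample variance (divisor n - 1), which is the convention that reproduces the reported figure.

```python
import math

# Value -> count, copied from the frequency table for
# TEXT_LeadParagraph_ENT_PERSON.
counts = {0: 461, 1: 263, 2: 82, 3: 20, 4: 7, 5: 5, 6: 1}

n = sum(counts.values())                       # number of observations
total = sum(v * c for v, c in counts.items())  # "Sum" in the report
mean = total / n

# Sample variance (divisor n - 1). This is an assumption about the
# report's convention; it matches the reported value.
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                # coefficient of variation
zeros_pct = 100 * counts[0] / n                # "Zeros (%)" in the report

print(n, total, round(mean, 4), round(variance, 4), round(cv, 4))
```

The same pattern applies to any count-valued column in the report, since the frequency table carries all the information needed for the mean, variance, and zero share.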
TEXT_Keyrwords_ENT_PERSON
Real number (ℝ≥0)
HIGH CORRELATION, ZEROS
| Distinct | 14 |
|---|---|
| Distinct (%) | 1.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.887961859 |
| Minimum | 0 |
| Maximum | 18 |
| Zeros | 262 |
| Zeros (%) | 31.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 3 |
| 95-th percentile | 6 |
| Maximum | 18 |
| Range | 18 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.256644502 |
|---|---|
| Coefficient of variation (CV) | 1.195280768 |
| Kurtosis | 8.863595042 |
| Mean | 1.887961859 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 2.305513681 |
| Sum | 1584 |
| Variance | 5.092444409 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 262 | 31.2% |
| 1 | 208 | 24.8% |
| 2 | 139 | 16.6% |
| 3 | 74 | 8.8% |
| 4 | 67 | 8.0% |
| 5 | 34 | 4.1% |
| 6 | 24 | 2.9% |
| 7 | 9 | 1.1% |
| 10 | 6 | 0.7% |
| 8 | 5 | 0.6% |
| Other values (4) | 11 | 1.3% |
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 262 | 31.2% |
| 1 | 208 | 24.8% |
| 2 | 139 | 16.6% |
| 3 | 74 | 8.8% |
| 4 | 67 | 8.0% |
| 5 | 34 | 4.1% |
| 6 | 24 | 2.9% |
| 7 | 9 | 1.1% |
| 8 | 5 | 0.6% |
| 9 | 4 | 0.5% |
| Value | Count | Frequency (%) |
|---|---|---|
| 18 | 2 | 0.2% |
| 15 | 1 | 0.1% |
| 11 | 4 | 0.5% |
| 10 | 6 | 0.7% |
| 9 | 4 | 0.5% |
| 8 | 5 | 0.6% |
| 7 | 9 | 1.1% |
| 6 | 24 | 2.9% |
| 5 | 34 | 4.1% |
| 4 | 67 | 8.0% |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, so it is better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
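A minimal pure-Python sketch of that calculation follows: rank both variables, then take the covariance of the ranks divided by the product of their standard deviations. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) are hypothetical and chosen for illustration. Giving tied values the mean of their positions is an assumption about tie handling, and this is not the report's own implementation.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but nonlinear relationship (illustrative data).
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y), pearson_r(x, y))
```

On this toy data the ranks of `x` and `y` coincide, so ρ reaches its maximum while r stays below 1, which is the distinction the description draws.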
Pearson's r
Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
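The formula and the invariance property can be sketched in a few lines of plain Python. The function name `pearson_r` and the toy data are assumptions made for illustration. The invariance check uses a positive scale factor, since a negative one flips the sign of r.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    so plain sums of deviations are used.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Illustrative data, not taken from the dataset.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r = pearson_r(x, y)

# Shifting and positively rescaling x leaves r unchanged.
x_rescaled = [10 * v + 7 for v in x]
r_rescaled = pearson_r(x_rescaled, y)

print(r, r_rescaled)
```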
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations; τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
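The pair-counting definition can be written out directly. The sketch below implements exactly the formula stated in the description, which is the variant usually called τ-a. Statistical libraries commonly apply a tie-corrected variant instead, so values in the report may differ when the data contain ties. The function name `kendall_tau_a` and the toy data are hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 means a tie in x or y; it counts toward neither group
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Illustrative data: one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
print(kendall_tau_a(x, y))
```

This quadratic-time loop is meant only to mirror the definition; it is a sketch, not an efficient implementation.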
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available.
First rows
| | df_index | newsdesk | section | material | headline | abstract | keywords | word_count | pub_date | n_comments | uniqueID | print_section | print_page | lead_paragraph | headline.main | month | Has_Comments | comment_size | TEXT_LeadParagraph_POS_NOUN | TEXT_LeadParagraph_POS_PNOUN | TEXT_Keywords_POS_NOUN | TEXT_Keywords_POS_PNOUN | TEXT_headline.main_POS_NOUN | TEXT_headline.main_POS_PNOUN | TEXT_LeadParagraph_ENT_ORG | TEXT_Keywords_ENT_ORG | TEXT_LeadParagraph_ENT_NORP | TEXT_Keywords_ENT_NORP | TEXT_LeadParagraph_ENT_FAC | TEXT_Keywords_ENT_FAC | TEXT_LeadParagraph_ENT_GPE | TEXT_Keywords_ENT_GPE | TEXT_LeadParagraph_ENT_LOC | TEXT_Keywords_ENT_LOC | TEXT_LeadParagraph_ENT_PERSON | TEXT_Keyrwords_ENT_PERSON |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 13715 | Sports | Sports | News | Mookie Betts Leads Dodgers’ Stars With a Masterly Performance | The Dodgers overwhelmed the Rays in Game 1, with Betts showing off the consistent excellence that led Los Angeles to sign him to a 12-year contract. | ['Baseball', 'World Series', 'Boston Red Sox', 'Los Angeles Dodgers', 'Tampa Bay Rays', 'Bellinger, Cody (1995- )', 'Betts, Mookie (1992- )', 'Kershaw, Clayton'] | 1083 | 2020-10-21 13:23:08+00:00 | 18 | nyt://article/905b23aa-991f-5f02-9a7b-f0a09a7cba27 | B | 10 | ARLINGTON, Texas — Mookie Betts was 24 years old when he wondered if he would ever be any better. He had just finished as runner-up to Mike Trout for the American League Most Valuable Player Award in 2016, with a season full of hits and homers and steals and defensive excellence. What could he do next? | Mookie Betts Leads Dodgers’ Stars With a Masterly Performance | 10 | 1 | S | 8 | 12 | 3 | 17 | 2 | 4 | 1 | 3 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 2 | 3 |
| 1 | 4455 | Well | Well | News | Taking Baths May Be Good for Your Heart | Daily baths reduced the risk of heart disease and stroke. | ['Bathing and Showering', 'Heart', 'Blood Pressure'] | 242 | 2020-04-01 16:34:18+00:00 | 14 | nyt://article/40e910f7-8d5f-5beb-9e7c-7ed27ac1597c | D | 6 | Taking frequent baths may reduce the risk for cardiovascular disease, new research suggests. | Taking Baths May Be Good for Your Heart | 4 | 1 | S | 4 | 0 | 3 | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 1886 | Upshot | The Upshot | News | Young Men Embrace Gender Equality, but They Still Don’t Vacuum | New studies show traditional views persist about who does what at home, and it’s holding women back. | ['Work-Life Balance', 'Women and Girls', 'Research', 'Men and Boys', 'Parenting', 'Labor and Jobs', 'United States'] | 1315 | 2020-02-11 10:00:12+00:00 | 632 | nyt://article/d96041df-e929-51ad-8dd1-88bb84964888 | B | 5 | Young people today have become much more open-minded about gender roles — it shows up in their attitudes about pronouns, politics and sports. But in one area, change has been minimal. They are holding on to traditional views about who does what at home. | Young Men Embrace Gender Equality, but They Still Don’t Vacuum | 2 | 1 | L | 12 | 0 | 8 | 5 | 0 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| 3 | 7305 | Foreign | World | News | 2 Weeks, 6.5 Million Coronavirus Tests as Wuhan Nears Goal | The Chinese city where the outbreak began is seeking to test all its 11 million residents, and the pandemic has forced the fashion industry to take a hard look in the mirror. | ['Coronavirus (2019-nCoV)'] | 3805 | 2020-05-26 04:05:17+00:00 | 61 | nyt://article/4154be08-a296-5881-8342-e9f223dc484c | NaN | NaN | 新冠病毒疫情最新消息 | 2 Weeks, 6.5 Million Coronavirus Tests as Wuhan Nears Goal | 5 | 1 | S | 0 | 1 | 0 | 2 | 2 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 12331 | OpEd | Opinion | Op-Ed | Is the Stock Market Rooting for Trump or Biden? | Neither. Wall Street is not as partisan as you think. | ['Presidential Election of 2020', 'United States Economy', 'Stocks and Bonds'] | 880 | 2020-09-21 09:00:11+00:00 | 317 | nyt://article/afed02eb-f035-542a-8791-c243893e65d7 | A | 23 | For months the S&P 500 rose this year — despite a deadly pandemic, the resulting economic devastation and the rise of a Democratic Party increasingly sympathetic to democratic socialism. Then, this month, with Joe Biden doing well in the polls, stock prices finally stumbled. | Is the Stock Market Rooting for Trump or Biden? | 9 | 1 | M | 10 | 5 | 3 | 5 | 0 | 5 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 5 | 6152 | Politics | U.S. | News | Why Biden’s Choice of Running Mate Has Momentous Implications | Joe Biden has hinted that he might serve only one term if he wins. That would set up a woman as the front-runner for 2024 and perhaps define the Democratic agenda for the next decade. | ['Presidential Election of 2020', 'Vice Presidents and Vice Presidency (US)', 'United States Politics and Government', 'Presidential Election of 2024', 'Biden, Joseph R Jr', 'Women and Girls', 'Democratic Party', 'Abrams, Stacey Y', 'Demings, Val', 'Grisham, Michelle Lujan', 'Harris, Kamala D', 'Klobuchar, Amy', 'Warren, Elizabeth', 'Rice, Susan E'] | 1725 | 2020-05-03 17:48:35+00:00 | 290 | nyt://article/80b0675a-7867-57f2-919e-8642e939f747 | A | 1 | WASHINGTON — For decades the vice-presidential selection process has had an air of cloak-and-dagger to it. The party’s nominees would say little about their thinking, the would-be running mates would reveal even less, and an elaborate game of subterfuge would unfold that mostly captivated political insiders and usually had little bearing on the election. | Why Biden’s Choice of Running Mate Has Momentous Implications | 5 | 1 | M | 16 | 1 | 2 | 38 | 1 | 4 | 0 | 2 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 10 |
| 6 | 14233 | Learning | The Learning Network | News | Lesson of the Day: ‘Lil Buck Feels the Dancing Spirit All Over Again’ | In this lesson, students watch a short film by the dancer Lil Buck, then consider his contention that street dance is fine art and can be “a tool to help bring change.” | [] | 788 | 2020-11-02 10:00:03+00:00 | 6 | nyt://article/0e73f3b9-b692-572f-85cd-6ab8831dfffc | NaN | NaN | Featured Article: “Lil Buck Feels the Dancing Spirit All Over Again,” by Gia Kourlas | Lesson of the Day: ‘Lil Buck Feels the Dancing Spirit All Over Again’ | 11 | 1 | S | 1 | 6 | 0 | 0 | 0 | 6 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 7 | 15496 | Styles | Fashion & Style | News | Alicia Keys Figures Out Her Skin | The singer-songwriter talks about finding her rituals (and learning how to deal with carbs). | ['Keys, Alicia', 'Keys Soulcare (ELF Cosmetics Inc)', 'Cosmetics and Toiletries', 'Skin', 'Exercise', 'Meditation', 'Quarantine (Life and Culture)', 'Content Type: Personal Profile', 'Rhythm and Blues (Music)'] | 1226 | 2020-12-01 17:13:41+00:00 | 16 | nyt://article/6ed0d121-21ca-54d5-b0cf-2e7cef434eb5 | D | 3 | | Alicia Keys Figures Out Her Skin | 12 | 1 | S | 0 | 0 | 6 | 16 | 1 | 2 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 |
| 8 | 10579 | Weekend | Books | News | She Explains ‘Mansplaining’ With Help From 17th-Century Art | In her new book “Men to Avoid in Art and Life,” Nicole Tersigni harnesses her skill with a Twitter meme to illuminate the experience of women harassed by concern trolls, “sexperts” and more. | ['Books and Literature', 'Women and Girls', 'Comedy and Humor', 'Social Media', 'Writing and Writers', 'Art', 'Discrimination', 'Tersigni, Nicole', 'Men to Avoid in Art and Life (Book)'] | 893 | 2020-08-10 09:00:25+00:00 | 1523 | nyt://article/ef8de76b-cb6b-55e2-9a1c-1e8165c20cb7 | C | 12 | This story begins, as so many do these days, on Twitter. | She Explains ‘Mansplaining’ With Help From 17th-Century Art | 8 | 1 | L | 2 | 1 | 8 | 10 | 3 | 1 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
| 9 | 6338 | Podcasts | Podcasts | News | A Socially Distanced Senate | The two chambers of Congress received the same medical advice about reconvening. They took different decisions. | ['United States Politics and Government', 'Coronavirus (2019-nCoV)', 'House of Representatives', 'Senate', 'Quarantines', 'Washington (DC)', 'Capitol Building (Washington, DC)'] | 284 | 2020-05-06 09:59:07+00:00 | 6 | nyt://article/f229cb93-f6ab-5bef-bac4-89f3449d1721 | NaN | NaN | Listen and subscribe to our podcast from your mobile device: Via Apple Podcasts \| Via Spotify \| Via Stitcher | A Socially Distanced Senate | 5 | 1 | S | 4 | 6 | 1 | 15 | 0 | 2 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 1 |
Last rows
| | df_index | newsdesk | section | material | headline | abstract | keywords | word_count | pub_date | n_comments | uniqueID | print_section | print_page | lead_paragraph | headline.main | month | Has_Comments | comment_size | TEXT_LeadParagraph_POS_NOUN | TEXT_LeadParagraph_POS_PNOUN | TEXT_Keywords_POS_NOUN | TEXT_Keywords_POS_PNOUN | TEXT_headline.main_POS_NOUN | TEXT_headline.main_POS_PNOUN | TEXT_LeadParagraph_ENT_ORG | TEXT_Keywords_ENT_ORG | TEXT_LeadParagraph_ENT_NORP | TEXT_Keywords_ENT_NORP | TEXT_LeadParagraph_ENT_FAC | TEXT_Keywords_ENT_FAC | TEXT_LeadParagraph_ENT_GPE | TEXT_Keywords_ENT_GPE | TEXT_LeadParagraph_ENT_LOC | TEXT_Keywords_ENT_LOC | TEXT_LeadParagraph_ENT_PERSON | TEXT_Keyrwords_ENT_PERSON |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 829 | 16119 | Travel | Travel | News | How to Pretend You’re in Singapore Tonight | You can feel like you are in the Lion City with a little work in the kitchen, the right book and some time in front of the TV. | ['Cooking and Cookbooks', 'Movies', 'Travel and Vacations', 'Food', 'Television', 'Restaurants', 'Books and Literature', 'Bourdain, Anthony', 'Kwan, Kevin', 'Liew, Sonny', 'Singapore'] | 1432 | 2020-12-15 10:00:25+00:00 | 64 | nyt://article/a511396b-640b-5be3-92e6-b0497bf6611e | NaN | NaN | It took over a dozen visits to Singapore for me to fall in love with it. But when I did, I fell hard. As a teenager living in Jakarta, Indonesia — just under two hours away by direct flight — I looked at Singapore’s shiny veneer and dismissed the whole place as shallow and materialistic. It was one big shopping mall, I thought, with too many rules and not enough character. But then, as I kept going back, I intentionally squashed my preconceptions and I started noticing other things. I quickly realized how much I had been missing. | How to Pretend You’re in Singapore Tonight | 12 | 1 | S | 14 | 4 | 5 | 11 | 0 | 2 | 0 | 2 | 0 | 0 | 0 | 0 | 3 | 1 | 0 | 0 | 0 | 2 |
| 830 | 12843 | Washington | U.S. | News | Trump Tests Positive for the Coronavirus | The president’s result came after he spent months playing down the severity of the outbreak that has killed more than 207,000 in the United States and hours after insisting that “the end of the pandemic is in sight.” | ['Trump, Donald J', 'Coronavirus (2019-nCoV)', 'United States Politics and Government', 'Presidents and Presidency (US)', 'Twenty-Fifth Amendment (US Constitution)', 'Presidential Election of 2020', 'White House Coronavirus Outbreak (2020)'] | 1914 | 2020-10-02 05:01:16+00:00 | 3707 | nyt://article/de217dd9-3383-574a-9eaf-c0303570e794 | NaN | NaN | [Read our live updates on President Trump’s coronavirus diagnosis.] | Trump Tests Positive for the Coronavirus | 10 | 1 | L | 2 | 3 | 1 | 22 | 0 | 3 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 2 |
| 831 | 6238 | Metro | New York | News | 2 Die From the Virus at a Bronx Bus Depot, and Drivers Are Rattled | On the front line of the city’s battle against the pandemic, bus drivers are struggling to come to terms with their role as essential workers. | ['Buses', 'Coronavirus (2019-nCoV)', 'Transit Systems', 'Bronx (NYC)', 'Workplace Hazards and Violations', 'Metropolitan Transportation Authority', 'New York City', 'New York City Transit Authority'] | 1686 | 2020-05-05 07:00:11+00:00 | 187 | nyt://article/769ee82a-293d-5203-bc8c-e37cae61c6c3 | A | 1 | Angel Volquez was already on edge. For weeks, the New York City bus driver had watched as the city’s grim new reality played out in front of him through the large glass windshield like a dystopian movie. | 2 Die From the Virus at a Bronx Bus Depot, and Drivers Are Rattled | 5 | 1 | M | 10 | 5 | 1 | 20 | 1 | 4 | 0 | 3 | 0 | 0 | 0 | 0 | 1 | 3 | 0 | 0 | 1 | 2 |
| 832 | 6615 | Learning | The Learning Network | News | Should Students Be Monitored When Taking Online Tests? | Is surveillance necessary to prevent students from cheating during online exams, or does it violate students’ privacy? | [] | 764 | 2020-05-12 09:00:05+00:00 | 301 | nyt://article/20c9eb36-46a9-565f-ba13-22e737a72eb9 | NaN | NaN | Has cheating on tests ever been a problem at your school? What about now that school has gone online? What steps do you think teachers and professors should take to ensure that students are completing online exams honestly? Is there a point when online surveillance impedes on students’ privacy — or even becomes “creepy”? | Should Students Be Monitored When Taking Online Tests? | 5 | 1 | M | 14 | 1 | 0 | 0 | 1 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 833 | 14143 | OpEd | Opinion | Op-Ed | The Woman President Who Wasn’t | Trump may have beaten Hillary Clinton, but the story doesn’t end there. | ['Presidential Election of 2020', 'Women and Girls', 'United States Politics and Government', 'Elections, Senate', "Women's Rights", 'Biden, Joseph R Jr', 'Senate', 'Trump, Donald J', 'Clinton, Hillary Rodham'] | 911 | 2020-10-30 09:00:14+00:00 | 171 | nyt://article/d90c1075-cad0-55e1-bd3b-015f40de473e | SR | 11 | One of my clearest memories of election night in 2016 is running into women who were going to watch the results with their daughters, so they’d get to share the experience of seeing Hillary Clinton elected president — the moment when the political glass ceiling in America would be shattered forever. | The Woman President Who Wasn’t | 10 | 1 | M | 11 | 3 | 4 | 20 | 0 | 2 | 0 | 2 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 6 |
| 834 | 1518 | Foreign | World | News | Beijing in the Time of Coronavirus: No Traffic, Empty Parks and Fear | The Chinese capital, like other cities far from the epidemic’s center, has imposed restrictions and shut down public spaces, straining the ties that bind society. | ['Coronavirus (2019-nCoV)', 'Beijing (China)', 'Shopping and Retail', 'Politics and Government', 'Epidemics', 'Parks and Other Recreation Areas', 'Communist Party of China', 'China'] | 1273 | 2020-02-03 16:06:58+00:00 | 18 | nyt://article/40edd806-d950-5b6f-a751-d7ddbc3e9f92 | A | 7 | BEIJING — The Apple stores were among the busiest places still open in Beijing after the coronavirus outbreak, though employees forbade customers to try the watches or AirPods. | Beijing in the Time of Coronavirus: No Traffic, Empty Parks and Fear | 2 | 1 | S | 7 | 4 | 4 | 13 | 1 | 5 | 1 | 1 | 0 | 0 | 0 | 0 | 2 | 2 | 0 | 0 | 0 | 1 |
| 835 | 9970 | Books | Books | News | The Celebrity Bookshelf Detective Is Back | We peer over the shoulders of Gwyneth Paltrow, Regina King, Charlamagne tha God, Yo-Yo Ma and others for a glimpse at their reading habits. | ['Celebrities', 'Quarantine (Life and Culture)', 'Books and Literature', 'Videophones and Videoconferencing', 'Hanks, Tom', 'LuPone, Patti', 'Penn, Sean', 'Paltrow, Gwyneth', 'Charlamagne Tha God', 'Powell, Colin L', 'King, Regina (1971- )', 'Ma, Yo-Yo'] | 1094 | 2020-07-27 09:00:25+00:00 | 156 | nyt://article/a4c60180-9a43-5fca-8109-8d5c3bc72787 | BR | 27 | Zoom happy hours are far less exciting now than they were in March. So is sourdough starter. A lot of the early preoccupations of our lockdown life don’t quite have the same charm anymore. (Remember the night we all made lasagna together?) But the chance to speculate about the minds and proclivities of the famous by gawking at their bookshelves never gets old. So after a first foray into bookshelf sleuthing, we’re back for more. | The Celebrity Bookshelf Detective Is Back | 7 | 1 | M | 15 | 2 | 3 | 25 | 0 | 3 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 4 |
| 836 | 5899 | Foreign | World | News | Genoa’s New Bridge Nears Completion, Turning Tragedy Into Hope | Nearly two years after 43 people died when a bridge collapsed, its replacement, built in record time, has become a symbol of Italian can-do. | ['Morandi Bridge (Genoa, Italy)', 'Bridges and Tunnels', 'Infrastructure (Public Works)', 'Piano, Renzo', 'Roads and Traffic', 'Conte, Giuseppe', 'Salini Impregilo SpA', 'Fincantieri', 'Genoa (Italy)'] | 1164 | 2020-04-28 14:23:28+00:00 | 56 | nyt://article/fd07ece0-584c-5199-bad0-177ff5cc972e | A | 18 | ROME — When the Morandi Bridge, a vital east-west transportation artery in the heart of Genoa, collapsed on Aug. 14, 2018, killing 43 people, there was little reason to think that its replacement would be in the final phases of construction less than two years later. | Genoa’s New Bridge Nears Completion, Turning Tragedy Into Hope | 4 | 1 | S | 10 | 6 | 4 | 17 | 0 | 6 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 2 | 5 |
| 837 | 3826 | The Upshot | The Upshot | Interactive Feature | Coronavirus Deaths by U.S. State and Country Over Time: Daily Tracker | Compare the number of deaths and the rate of increase over time in the places the virus has hit hardest so far. | ['Coronavirus (2019-nCoV)', 'States (US)', 'Epidemics', 'Deaths (Fatalities)', 'United States', 'Spain', 'France', 'Italy', 'China', 'South Korea', 'Germany'] | 0 | 2020-03-21 14:21:03+00:00 | 1761 | nyt://interactive/ac617f90-cbd1-5f17-abe4-db5bb065dce3 | NaN | NaN | Compare the number of deaths and the rate of increase over time in the places the virus has hit hardest so far. | Coronavirus Deaths by U.S. State and Country Over Time: Daily Tracker | 3 | 1 | L | 7 | 0 | 3 | 13 | 0 | 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8 | 0 | 0 | 0 | 0 |
| 838 | 1564 | National | U.S. | News | Pets Are Just ‘Property,’ So Owners Can’t Do Much When Vets Harm Them | Doctors who harm their patients face costly lawsuits and other serious consequences. There is much less accountability for veterinarians, as devastated pet owners in Oregon learned. | ['Veterinary Medicine', 'Animal Abuse, Rights and Welfare', 'Pets', 'Dogs', 'Cats', 'California', 'Oregon', 'Koller, Daniel', 'Regulation and Deregulation of Industry', 'Suits and Litigation (Civil)'] | 1846 | 2020-02-04 10:00:27+00:00 | 293 | nyt://article/a55a55bd-8221-508e-a704-f5da1a6c9b3b | A | 12 | BEAVERTON, Ore. — After his dog Bleu sustained a leg injury over the summer, Andres Figueroa brought the 7-month-old dachshund in for a checkup at a sleek suburban clinic outside Portland, Ore., that was decorated with cutouts of cheerful pets. | Pets Are Just ‘Property,’ So Owners Can’t Do Much When Vets Harm Them | 2 | 1 | M | 10 | 7 | 6 | 13 | 3 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 3 | 2 | 0 | 0 | 2 | 2 |